Import Statements
library(NHANES)
Warning: package 'NHANES' was built under R version 4.2.2
library(tidyverse)
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.3.6 ✔ purrr 0.3.4
✔ tibble 3.1.8 ✔ dplyr 1.0.10
✔ tidyr 1.2.1 ✔ stringr 1.4.1
✔ readr 2.1.2 ✔ forcats 0.5.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
library(kableExtra)
Warning: package 'kableExtra' was built under R version 4.2.2
Attaching package: 'kableExtra'
The following object is masked from 'package:dplyr':
group_rows
1. (8 pts) Select at least three categorical variables from the
’NHANES‘ data that are not considered in the lecture notes. Be creative
in presenting variation in each and covariation in at least two of
them.
Create the dataset.
as_tibble(NHANES)
df_eda_3_categorical <- select(NHANES, Education, HomeOwn, Work)
df_eda_3_categorical <- filter(df_eda_3_categorical, Education != "NA", HomeOwn != "NA", Work != "NA")
df_eda_3_categorical
The below graph will show the variation of Education. We see that in
this dataset most people have gone to college or hold a degree.
ggplot(data=df_eda_3_categorical) +
geom_bar(mapping = aes(x=Education))

The below graph will show the variation of Home Ownership. We can
see that in this dataset most people own a home.
ggplot(data=df_eda_3_categorical) +
geom_bar(mapping = aes(x=HomeOwn))

The below graph will show the variation of Working Status. We can
see that in this dataset, most people are working, however there are
quite a bit without a job or looking.
ggplot(data=df_eda_3_categorical) +
geom_bar(mapping = aes(x=Work))

Below we are creating a graph that shows the covariation of Level of
Education and Work. We can see that as Education increases so does the
number currently working.
ggplot(data = df_eda_3_categorical) +
geom_bar(mapping = aes(x=Education, fill=Work), position="dodge")

Finally, I am creating a covariation between Education and Home
Ownership. We can see that as education increases more people own a home
and after obtaining a college degree less people rent and even more
own.
ggplot(data = df_eda_3_categorical) +
geom_bar(mapping = aes(x=Education, fill=HomeOwn), position="dodge")

2. (8 pts) Select two continuous variables from the ’NHANES‘ data
that are not considered in the lecture notes. Be creative in presenting
variation in each and covariation in between them by considering the
levels of at least one categorical variables that you picked in
(1).
df_eda_2_continuous <- select(NHANES, DaysMentHlthBad, SleepHrsNight)
df_eda_2_continuous <- filter(df_eda_2_continuous, DaysMentHlthBad != "NA", SleepHrsNight != "NA")
df_eda_2_continuous
Below is a visualization of the variance of the days they had bad
mental health. I set the bandwidth to 5 since there were a lot of values
in between each step this keeps it cleaner. We can see by the below
visualization that most people didn’t have bad mental health.
ggplot(data = df_eda_2_continuous) +
geom_histogram(mapping = aes(x = DaysMentHlthBad), binwidth = 5 )

Next we will show the variance of the number of sleep hours the
participants had at night. I set the bandwidth to one since the
observations were between 0 and 13 we would be able to easily see all
data. We can tell from the below data that most people get between 6 and
8 hours of sleep.
ggplot(data = df_eda_2_continuous) +
geom_histogram(mapping = aes(x = SleepHrsNight), binwidth = 1)

For the next part I will create a dataset with all 5 attributes I am
working with.
df_eda_5_attributes <- select(NHANES, Education, HomeOwn, Work, DaysMentHlthBad, SleepHrsNight)
df_eda_5_attributes <- filter(df_eda_5_attributes, Education != "NA", HomeOwn != "NA", Work != "NA",
DaysMentHlthBad != "NA", SleepHrsNight != "NA")
df_eda_5_attributes
Next I will show a covariance of what effect education has on days
of bad mental health. We can tell there isn’t a large effect other than
having a college degree seems to lesson it some. However there is more
bad mental health with those that dropped
ggplot(data = df_eda_5_attributes, mapping = aes(x = Education, y = DaysMentHlthBad)) +
geom_boxplot()

Below is a table of the descriptive statistics for bad mental health
in regard to education and work.
df_eda_5_attributes_bad_mental_health_stats <- df_eda_5_attributes %>%
filter(!is.na(DaysMentHlthBad)) %>%
group_by(Education, Work) %>%
summarise(mean = mean(DaysMentHlthBad), stdev = sd(DaysMentHlthBad), N = n()) %>%
ungroup() %>%
pivot_wider(names_from = Work, values_from = c(mean, stdev, N)) %>%
select(Education,+ ends_with("College Grad"), everything())
`summarise()` has grouped output by 'Education'. You can override using the
`.groups` argument.
kable(df_eda_5_attributes_bad_mental_health_stats,
caption = "Descriptive Stats for Bad Mental Health vs Education and Work Status",
escape = F,
digits = 3,
longtable = T,
col.names = c("Education", "Mean", "St. deviation", "N", "Mean", "St. deviation", "N",
"Mean", "St. deviation", "N")) %>%
add_header_above(c(" " = 1, "Looking" = 3, "Working" = 3, "Not Working" = 3))
Descriptive Stats for Bad Mental Health vs Education and Work Status
|
|
Looking
|
Working
|
Not Working
|
|
Education
|
Mean
|
St. deviation
|
N
|
Mean
|
St. deviation
|
N
|
Mean
|
St. deviation
|
N
|
|
8th Grade
|
4.667
|
5.485
|
3.347
|
7.230
|
9.569
|
7.961
|
6
|
206
|
150
|
|
9 - 11th Grade
|
7.455
|
6.602
|
4.540
|
11.057
|
10.488
|
8.484
|
33
|
377
|
359
|
|
High School
|
6.000
|
4.610
|
3.780
|
9.248
|
8.377
|
6.959
|
48
|
516
|
794
|
|
Some College
|
5.342
|
5.263
|
4.221
|
8.855
|
9.420
|
7.995
|
79
|
708
|
1247
|
|
College Grad
|
1.580
|
2.693
|
3.294
|
4.049
|
6.475
|
6.559
|
69
|
410
|
1406
|
LS0tDQp0aXRsZTogIkRTQ0kgNjEwIEhXMiBJbnRyb2R1Y3Rpb24gdG8gRURBIg0Kb3V0cHV0OiBodG1sX25vdGVib29rDQotLS0NCg0KIyMjIEltcG9ydCBTdGF0ZW1lbnRzDQpgYGB7cn0NCmxpYnJhcnkoTkhBTkVTKQ0KbGlicmFyeSh0aWR5dmVyc2UpDQpsaWJyYXJ5KGthYmxlRXh0cmEpDQpgYGANCg0KIyMjIDEuICg4IHB0cykgU2VsZWN0IGF0IGxlYXN0IHRocmVlIGNhdGVnb3JpY2FsIHZhcmlhYmxlcyBmcm9tIHRoZSDigJhOSEFORVPigJggZGF0YSB0aGF0IGFyZSBub3QgY29uc2lkZXJlZCBpbiB0aGUgbGVjdHVyZSBub3Rlcy4gQmUgY3JlYXRpdmUgaW4gcHJlc2VudGluZyB2YXJpYXRpb24gaW4gZWFjaCBhbmQgY292YXJpYXRpb24gaW4gYXQgbGVhc3QgdHdvIG9mIHRoZW0uDQoNCiMjIyMgQ3JlYXRlIHRoZSBkYXRhc2V0Lg0KDQpgYGB7cn0NCmFzX3RpYmJsZShOSEFORVMpDQoNCmRmX2VkYV8zX2NhdGVnb3JpY2FsIDwtIHNlbGVjdChOSEFORVMsIEVkdWNhdGlvbiwgSG9tZU93biwgV29yaykNCmRmX2VkYV8zX2NhdGVnb3JpY2FsIDwtIGZpbHRlcihkZl9lZGFfM19jYXRlZ29yaWNhbCwgRWR1Y2F0aW9uICE9ICJOQSIsIEhvbWVPd24gIT0gIk5BIiwgV29yayAhPSAiTkEiKQ0KZGZfZWRhXzNfY2F0ZWdvcmljYWwNCg0KYGBgDQoNCiMjIyMgVGhlIGJlbG93IGdyYXBoIHdpbGwgc2hvdyB0aGUgdmFyaWF0aW9uIG9mIEVkdWNhdGlvbi4gV2Ugc2VlIHRoYXQgaW4gdGhpcyBkYXRhc2V0IG1vc3QgcGVvcGxlIGhhdmUgZ29uZSB0byBjb2xsZWdlIG9yIGhvbGQgYSBkZWdyZWUuDQpgYGB7cn0NCmdncGxvdChkYXRhPWRmX2VkYV8zX2NhdGVnb3JpY2FsKSArDQogIGdlb21fYmFyKG1hcHBpbmcgPSBhZXMoeD1FZHVjYXRpb24pKQ0KDQpgYGANCg0KIyMjIyBUaGUgYmVsb3cgZ3JhcGggd2lsbCBzaG93IHRoZSB2YXJpYXRpb24gb2YgSG9tZSBPd25lcnNoaXAuIFdlIGNhbiBzZWUgdGhhdCBpbiB0aGlzIGRhdGFzZXQgbW9zdCBwZW9wbGUgb3duIGEgaG9tZS4NCmBgYHtyfQ0KZ2dwbG90KGRhdGE9ZGZfZWRhXzNfY2F0ZWdvcmljYWwpICsNCiAgZ2VvbV9iYXIobWFwcGluZyA9IGFlcyh4PUhvbWVPd24pKQ0KYGBgDQoNCiMjIyMgVGhlIGJlbG93IGdyYXBoIHdpbGwgc2hvdyB0aGUgdmFyaWF0aW9uIG9mIFdvcmtpbmcgU3RhdHVzLiBXZSBjYW4gc2VlIHRoYXQgaW4gdGhpcyBkYXRhc2V0LCBtb3N0IHBlb3BsZSBhcmUgd29ya2luZywgaG93ZXZlciB0aGVyZSBhcmUgcXVpdGUgYSBiaXQgd2l0aG91dCBhIGpvYiBvciBsb29raW5nLg0KYGBge3J9DQpnZ3Bsb3QoZGF0YT1kZl9lZGFfM19jYXRlZ29yaWNhbCkgKw0KICBnZW9tX2JhcihtYXBwaW5nID0gYWVzKHg9V29yaykpDQpgYGANCg0KIyMjIyBCZWxvdyB3ZSBhcmUgY3JlYXRpbmcgYSBncmFwaCB0aGF0IHNob3dzIHRoZSBjb3ZhcmlhdGlvbiBvZiBMZXZlbCBvZiBFZHVjYXRpb24gYW5kIFdvcmsuIFdlIGNhbiBzZWUgdGhhdCBhcyBFZHVjYXRpb24gaW5jcmVhc2VzIHNvIGRvZXMgdGhlIG51bWJlciBjdXJyZW50bHkgd29ya2luZy4NCmBgYHtyfQ0KZ2dwbG90KGRhdGEgPSBkZl9lZGFfM19jYXRlZ29yaWNhbCkgKw0KICAgICAgICBnZW9tX2JhcihtYXBwaW5nID0gYWVzKHg9RWR1Y2F0aW9uLCBmaWxsPVdvcmspLCBwb3NpdGlvbj0iZG9kZ2UiKQ0KYGBgDQoNCiMjIyMgRmluYWxseSwgSSBhbSBjcmVhdGluZyBhIGNvdmFyaWF0aW9uIGJldHdlZW4gRWR1Y2F0aW9uIGFuZCBIb21lIE93bmVyc2hpcC4gV2UgY2FuIHNlZSB0aGF0IGFzIGVkdWNhdGlvbiBpbmNyZWFzZXMgbW9yZSBwZW9wbGUgb3duIGEgaG9tZSBhbmQgYWZ0ZXIgb2J0YWluaW5nIGEgY29sbGVnZSBkZWdyZWUgbGVzcyBwZW9wbGUgcmVudCBhbmQgZXZlbiBtb3JlIG93bi4NCmBgYHtyfQ0KZ2dwbG90KGRhdGEgPSBkZl9lZGFfM19jYXRlZ29yaWNhbCkgKw0KICAgICAgICBnZW9tX2JhcihtYXBwaW5nID0gYWVzKHg9RWR1Y2F0aW9uLCBmaWxsPUhvbWVPd24pLCBwb3NpdGlvbj0iZG9kZ2UiKQ0KYGBgDQoNCiMjIyAyLiAoOCBwdHMpIFNlbGVjdCB0d28gY29udGludW91cyB2YXJpYWJsZXMgZnJvbSB0aGUg4oCYTkhBTkVT4oCYIGRhdGEgdGhhdCBhcmUgbm90IGNvbnNpZGVyZWQgaW4gdGhlIGxlY3R1cmUgbm90ZXMuIEJlIGNyZWF0aXZlIGluIHByZXNlbnRpbmcgdmFyaWF0aW9uIGluIGVhY2ggYW5kIGNvdmFyaWF0aW9uIGluIGJldHdlZW4gdGhlbSBieSBjb25zaWRlcmluZyB0aGUgbGV2ZWxzIG9mIGF0IGxlYXN0IG9uZSBjYXRlZ29yaWNhbCB2YXJpYWJsZXMgdGhhdCB5b3UgcGlja2VkIGluICgxKS4NCmBgYHtyfQ0KZGZfZWRhXzJfY29udGludW91cyA8LSBzZWxlY3QoTkhBTkVTLCBEYXlzTWVudEhsdGhCYWQsIFNsZWVwSHJzTmlnaHQpDQpkZl9lZGFfMl9jb250aW51b3VzIDwtIGZpbHRlcihkZl9lZGFfMl9jb250aW51b3VzLCBEYXlzTWVudEhsdGhCYWQgIT0gIk5BIiwgU2xlZXBIcnNOaWdodCAhPSAiTkEiKQ0KZGZfZWRhXzJfY29udGludW91cw0KYGBgDQoNCiMjIyMgQmVsb3cgaXMgYSB2aXN1YWxpemF0aW9uIG9mIHRoZSB2YXJpYW5jZSBvZiB0aGUgZGF5cyB0aGV5IGhhZCBiYWQgbWVudGFsIGhlYWx0aC4gSSBzZXQgdGhlIGJhbmR3aWR0aCB0byA1IHNpbmNlIHRoZXJlIHdlcmUgYSBsb3Qgb2YgdmFsdWVzIGluIGJldHdlZW4gZWFjaCBzdGVwIHRoaXMga2VlcHMgaXQgY2xlYW5lci4gV2UgY2FuIHNlZSBieSB0aGUgYmVsb3cgdmlzdWFsaXphdGlvbiB0aGF0IG1vc3QgcGVvcGxlIGRpZG4ndCBoYXZlIGJhZCBtZW50YWwgaGVhbHRoLg0KYGBge3J9DQpnZ3Bsb3QoZGF0YSA9IGRmX2VkYV8yX2NvbnRpbnVvdXMpICsNCmdlb21faGlzdG9ncmFtKG1hcHBpbmcgPSBhZXMoeCA9IERheXNNZW50SGx0aEJhZCksIGJpbndpZHRoID0gNSApDQoNCmBgYA0KDQojIyMjIE5leHQgd2Ugd2lsbCBzaG93IHRoZSB2YXJpYW5jZSBvZiB0aGUgbnVtYmVyIG9mIHNsZWVwIGhvdXJzIHRoZSBwYXJ0aWNpcGFudHMgaGFkIGF0IG5pZ2h0LiBJIHNldCB0aGUgYmFuZHdpZHRoIHRvIG9uZSBzaW5jZSB0aGUgb2JzZXJ2YXRpb25zIHdlcmUgYmV0d2VlbiAwIGFuZCAxMyB3ZSB3b3VsZCBiZSBhYmxlIHRvIGVhc2lseSBzZWUgYWxsIGRhdGEuIFdlIGNhbiB0ZWxsIGZyb20gdGhlIGJlbG93IGRhdGEgdGhhdCBtb3N0IHBlb3BsZSBnZXQgYmV0d2VlbiA2IGFuZCA4IGhvdXJzIG9mIHNsZWVwLg0KYGBge3J9DQpnZ3Bsb3QoZGF0YSA9IGRmX2VkYV8yX2NvbnRpbnVvdXMpICsNCmdlb21faGlzdG9ncmFtKG1hcHBpbmcgPSBhZXMoeCA9IFNsZWVwSHJzTmlnaHQpLCBiaW53aWR0aCA9IDEpDQpgYGANCg0KIyMjIyBGb3IgdGhlIG5leHQgcGFydCBJIHdpbGwgY3JlYXRlIGEgZGF0YXNldCB3aXRoIGFsbCA1IGF0dHJpYnV0ZXMgSSBhbSB3b3JraW5nIHdpdGguDQpgYGB7cn0NCmRmX2VkYV81X2F0dHJpYnV0ZXMgPC0gc2VsZWN0KE5IQU5FUywgRWR1Y2F0aW9uLCBIb21lT3duLCBXb3JrLCBEYXlzTWVudEhsdGhCYWQsIFNsZWVwSHJzTmlnaHQpDQpkZl9lZGFfNV9hdHRyaWJ1dGVzIDwtIGZpbHRlcihkZl9lZGFfNV9hdHRyaWJ1dGVzLCBFZHVjYXRpb24gIT0gIk5BIiwgSG9tZU93biAhPSAiTkEiLCBXb3JrICE9ICJOQSIsDQogICAgICAgICAgICAgICAgICAgICAgICAgICAgICBEYXlzTWVudEhsdGhCYWQgIT0gIk5BIiwgU2xlZXBIcnNOaWdodCAhPSAiTkEiKQ0KZGZfZWRhXzVfYXR0cmlidXRlcw0KYGBgDQoNCg0KIyMjIyBOZXh0IEkgd2lsbCBzaG93IGEgY292YXJpYW5jZSBvZiB3aGF0IGVmZmVjdCBlZHVjYXRpb24gaGFzIG9uIGRheXMgb2YgYmFkIG1lbnRhbCBoZWFsdGguIFdlIGNhbiB0ZWxsIHRoZXJlIGlzbid0IGEgbGFyZ2UgZWZmZWN0IG90aGVyIHRoYW4gaGF2aW5nIGEgY29sbGVnZSBkZWdyZWUgc2VlbXMgdG8gbGVzc29uIGl0IHNvbWUuIEhvd2V2ZXIgdGhlcmUgaXMgbW9yZSBiYWQgbWVudGFsIGhlYWx0aCB3aXRoIHRob3NlIHRoYXQgZHJvcHBlZA0KYGBge3J9DQpnZ3Bsb3QoZGF0YSA9IGRmX2VkYV81X2F0dHJpYnV0ZXMsIG1hcHBpbmcgPSBhZXMoeCA9IEVkdWNhdGlvbiwgeSA9IERheXNNZW50SGx0aEJhZCkpICArDQogIGdlb21fYm94cGxvdCgpDQpgYGANCg0KIyMjIEJlbG93IGlzIGEgdGFibGUgb2YgdGhlIGRlc2NyaXB0aXZlIHN0YXRpc3RpY3MgZm9yIGJhZCBtZW50YWwgaGVhbHRoIGluIHJlZ2FyZCB0byBlZHVjYXRpb24gYW5kIHdvcmsuDQpgYGB7cn0NCmRmX2VkYV81X2F0dHJpYnV0ZXNfYmFkX21lbnRhbF9oZWFsdGhfc3RhdHMgPC0gZGZfZWRhXzVfYXR0cmlidXRlcyAlPiUNCiAgZmlsdGVyKCFpcy5uYShEYXlzTWVudEhsdGhCYWQpKSAlPiUNCiAgZ3JvdXBfYnkoRWR1Y2F0aW9uLCBXb3JrKSAlPiUNCiAgc3VtbWFyaXNlKG1lYW4gPSBtZWFuKERheXNNZW50SGx0aEJhZCksIHN0ZGV2ID0gc2QoRGF5c01lbnRIbHRoQmFkKSwgTiA9IG4oKSkgJT4lDQogICAgICAgIHVuZ3JvdXAoKSAlPiUNCiAgICAgICAgcGl2b3Rfd2lkZXIobmFtZXNfZnJvbSA9IFdvcmssIHZhbHVlc19mcm9tID0gYyhtZWFuLCBzdGRldiwgTikpICU+JQ0KICAgICAgICBzZWxlY3QoRWR1Y2F0aW9uLCsgZW5kc193aXRoKCJDb2xsZWdlIEdyYWQiKSwgZXZlcnl0aGluZygpKQ0KDQprYWJsZShkZl9lZGFfNV9hdHRyaWJ1dGVzX2JhZF9tZW50YWxfaGVhbHRoX3N0YXRzLA0KICAgICAgY2FwdGlvbiA9ICJEZXNjcmlwdGl2ZSBTdGF0cyBmb3IgQmFkIE1lbnRhbCBIZWFsdGggdnMgRWR1Y2F0aW9uIGFuZCBXb3JrIFN0YXR1cyIsDQogICAgICBlc2NhcGUgPSBGLA0KICAgICAgZGlnaXRzID0gMywNCiAgICAgIGxvbmd0YWJsZSA9IFQsDQogICAgICBjb2wubmFtZXMgPSBjKCJFZHVjYXRpb24iLCAiTWVhbiIsICJTdC4gZGV2aWF0aW9uIiwgIk4iLCAiTWVhbiIsICJTdC4gZGV2aWF0aW9uIiwgIk4iLA0KICAgICAgICAgICAgICAgICAgICAiTWVhbiIsICJTdC4gZGV2aWF0aW9uIiwgIk4iKSkgJT4lDQogICAgICAgIGFkZF9oZWFkZXJfYWJvdmUoYygiICIgPSAxLCAiTG9va2luZyIgPSAzLCAiV29ya2luZyIgPSAzLCAiTm90IFdvcmtpbmciID0gMykpDQpgYGANCg0K